THE ALPHA-PHONEME-PHONOID 


Towards experiments on the “Ultimate ‘Unit’ of 
Music and Speech” 


C. R. Sankaran and K. S. Sampat 


A proposed attempt within our theoretical framework to investigate 
the alpha-phonoid — the “Ultimate ‘Unit’ of Music and Speech” — is 
presented in this paper. 


The alpha-phoneme is a Dedekind ‘cut’ which points to the
inter-phenomenon which is neither space nor time separately but, being both, is
especially the “punctiform origin of time”.1


The vistas claimed to have been opened up by the Alpha-Phonoid
theory have already been discussed by B. Chaitanya Deva with particular
reference to the tambura and music.2 It has also been pointed out that
the alpha-phoneme is a symbol of experience.3


In music and speech, we meet with both the non-observed and the
non-observable.4 From a, so to say, dimensionless experience (the
alpha-phoneme)5 we are descending, as it were, towards an attempt to define an
ultimate signal, a ‘Unit’ of environment or ‘Unit’ of communication (the
alpha-phonoid). Both in Music and Speech, we meet with structures of
organisation at various ‘levels’. Psychoacoustically, one could order
structures from the physicochemical undulations of acoustics to the
psychological (i.e. mathematical) construction of phonemes (or notes) and to the
ultimate alpha-phoneme, whence we may derive the alpha-phonoid.


We have also already discussed the alpha-phonoid of first order
(which is of the order of time, or time itself) and the alpha-phonoid of secondary
order (as an expedient unit of temporal measurement). In the wake of this
discussion, we pointed out both the ‘motivational’ and the ‘activational’
levels in the process of speech,6 which may be true of music also. The
motivational level represents the ‘inner speech (or music, if you will) and its
predicative function.’7


The first assumption behind the ultimate ‘Unit’ of music and speech 
is that a change in the time-constant T is necessary for any discrimination 
of duration, frequency or intensity. Now a detailed mathematical
demonstration of this assumption is a matter of urgency.


At the very outset, it must also be pointed out that this assumption rests
on the work of Mol and Uhlenbeck.8 The simultaneous presence of a change
in the time-constant of the inner process and a change in duration,
frequency or intensity in the outer process, that is, in the
neurophysiological and acoustical media respectively, is the very basis for the
inductive postulation of the existence of the continuum due to Veronese in
perceptual processes. We have already said enough of the deductive postulation of
such a continuum in the wake of the alpha-phoneme theory.9


The first condition in all our investigations concerning the “Ultimate
‘Unit’ of Music and Speech” is that duration, frequency and intensity
should be reducible as a result of their bi-unique correspondences, through 
transformation equations in terms of T;, Ta and Tr. 


The time-constant T should be taken as being equivalent to an observer 
and if description of speech or music are made in terms of times t (=T; + 
Ti%) )+ Tr§.......) and T by an equation, such an equation should 
not differ in form from a description of the same phenomenon in terms of
t′ and T′.10


It should be possible to establish transformational equations between
t, T and t′, T′. There will be a quantity X which will be an invariant for
all observers T, T′ etc.11 This invariant X may be taken as “the (ultra)
elementary constituent of perception, which extends in the time-series over
what may be called a duration, which since it is sui generis cannot be defined
in terms of anything else.”12


Adopting the words of A. N. Whitehead,13 the ‘pure interval’ (which
is the alpha-phoneme) is the “primitive experience ‘vector feeling’, that is 
to say, feeling from a beyond which is determinate and pointing to a beyond 
(the alpha-phonoid) which is to be determined.” 


There is no obtainable evidence from a psychological study of the
passage of time in individual experience to prove that the notion of an
‘instant’ (a duration) corresponds to anything in the world of sense-data.14


II


Before giving a detailed account of the acoustical structure of music and
speech, it is necessary to explain some basic ideas in the theory of the analysis
of speech oscillations.


In order to investigate music and speech processes thoroughly, we
endeavour to experiment into the structure and to resolve the effect into
individual elements or quanta. In seeking the appropriate quantal entity
which is to appear in the ultimate analysis without eliminating completely
the raster limits, we find that music and speech dynamics present a
substantially more difficult task than does, e.g., the static structure of crystals.
Music and Speech are continuous entities and if they are cut into individual 
elements of sound, the isolated elements give a distorted and uncharacteristic 
impression of flow. The choice of quantum is therefore empirical and it 
must depend on varying characteristics of individual audible sounds. 
There are no definite borders of the sound elements to be chosen from the 
flow of music and speech since each is influenced by neighbouring elements; 
one is thus obliged to choose a fictitious unit determined by musicological 
or linguistic properties, but this has then little correspondence with physical 
properties of the sound. An elementary quantity of this kind will be called 
a note in music or a phoneme in speech.15 


On the other hand it has been found possible in Phonetics to analyse 
imperceptibly different sounds; the process can be carried to ultimate 
physical limits. This subdivision is necessary when the full time and
frequency range must be covered in the description of the spectrum of natural
sounds; without such a complete description, the aural phenomena connected 
with the initiation and cessation of sounds cannot be understood. 


We now briefly review the analysis of stationary (i.e. non-varying) 
sound phenomena; strictly, of course such phenomena do not happen in 
nature and they are therefore irrelevant to the discussions on music and 
speech, but this will lead us to the methods of making acoustical measure- 
ments on the vocal tract; finally we come to foundations of the formant 
theory in its present-day form. 


Fourier’s theorem allows us to understand a complex tone as a
synthesis of sounds from harmonic partials. The frequencies of the
harmonics are integral multiples of the fundamental. The set of intensities of
the fundamental and its partials is termed the ‘spectrum of a complex
tone’. The spectrum gives no indication of the relative phases of the
harmonics. However, different phase-relationships between the same
harmonic components give an altogether different wave-shape of a complex
tone on an oscilloscope. Hence oscilloscope presentation is not very suitable
for the analytic assessment of sounds, apart from certain special purposes.
It is certainly true that phase relations have no significance in the
study of music and speech-structures. 
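
The remark that a spectrum is blind to phase while the wave-shape is not can be illustrated numerically. The following is only a minimal sketch: the three amplitudes, the two phase sets and the function names are illustrative assumptions, not values or procedures taken from this paper. It builds two complex tones from the same harmonic amplitudes with different phases and compares their amplitude spectra and their wave-shapes.

```python
import cmath
import math


def synthesize(amplitudes, phases, n_samples=64):
    """One period of a complex tone built from harmonic partials 1, 2, 3, ..."""
    return [
        sum(a * math.cos(2 * math.pi * (k + 1) * n / n_samples + p)
            for k, (a, p) in enumerate(zip(amplitudes, phases)))
        for n in range(n_samples)
    ]


def magnitude_spectrum(signal, n_harmonics):
    """Amplitudes of harmonics 1..n_harmonics obtained by a direct DFT."""
    n_samples = len(signal)
    magnitudes = []
    for k in range(1, n_harmonics + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * n / n_samples)
                    for n, x in enumerate(signal))
        magnitudes.append(2 * abs(coeff) / n_samples)
    return magnitudes


# Hypothetical amplitudes and phases, chosen only for illustration.
amplitudes = [1.0, 0.5, 0.25]
tone_a = synthesize(amplitudes, [0.0, 0.0, 0.0])
tone_b = synthesize(amplitudes, [0.0, math.pi / 2, math.pi])

spec_a = magnitude_spectrum(tone_a, 3)
spec_b = magnitude_spectrum(tone_b, 3)

# Largest sample-by-sample difference between the two wave-shapes.
max_wave_diff = max(abs(a - b) for a, b in zip(tone_a, tone_b))

print("spectrum A:", [round(m, 6) for m in spec_a])
print("spectrum B:", [round(m, 6) for m in spec_b])
print("largest wave-shape difference:", round(max_wave_diff, 6))
```

Under these assumptions the two amplitude spectra coincide, whereas the two wave-shapes differ visibly, which is the situation described for the oscilloscope display.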


Of course, the wide-spread view that the ear carries out a Fourier 
analysis in the cochlea is now disputed; it is considered to be valid for at 
most a limited range of frequencies, as shown by Mol and others (1963). 
Mol further points out that Fourier analysis also might be true only in
a limited range of frequencies. 


Naturally we are led to enquire what other methods of analysis of
a complex tone are possible.


In this connection the auto-correlation function φ(τ) and the power
density spectrum Φ(ω) are of particular interest; they are Fourier
transforms of each other, according to the Wiener-Khinchine theorem.16


The music or speech signal x(t) is delayed by the interval τ and the
product x(t) x(t + τ) is formed and integrated.
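
The delay, multiply and integrate procedure, and its relation to the power spectrum, may be checked on sampled data. The sketch below is offered under stated assumptions: it uses a discrete, circular (periodic) form of the auto-correlation and a direct discrete Fourier transform, and the test signal and all function names are hypothetical. It is not the instrumentation referred to in this paper.

```python
import cmath
import math


def dft(values):
    """Direct discrete Fourier transform (adequate for short sequences)."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def circular_autocorrelation(x):
    """Delay x by `lag` samples (wrapping around), multiply, and average."""
    n = len(x)
    return [sum(x[i] * x[(i + lag) % n] for i in range(n)) / n for lag in range(n)]


# Hypothetical test signal: two sinusoidal components, 32 samples.
n = 32
x = [math.sin(2 * math.pi * 3 * i / n) + 0.5 * math.sin(2 * math.pi * 5 * i / n + 0.7)
     for i in range(n)]

phi = circular_autocorrelation(x)

# Power spectrum obtained in two ways: from the auto-correlation, and directly.
power_from_phi = [c.real for c in dft(phi)]
power_direct = [abs(c) ** 2 / n for c in dft(x)]

max_error = max(abs(a - b) for a, b in zip(power_from_phi, power_direct))
print("largest discrepancy between the two spectra:", max_error)
```

In this discrete setting the transform of the auto-correlation agrees with the directly computed power spectrum to within rounding error, and the value at zero delay equals the mean square of the signal.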


K. Stevens was one of the pioneers to work out a simple
instrumentation for this purpose.17


In auto-correlation analysis of speech, we can give a “short time”
auto-correlation function by integrating over a finite time interval only.
If this integration process on the product, say f(t) f(t + τ), is allowed to
operate continuously, we obtain a running φ(τ) that varies with time,
just as the speech spectrum does. The auto-correlation function obtained
in this way will be very nearly equal to the Fourier transform of the spectrum.
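
A running, short-time auto-correlation of this kind can be sketched as follows. The rectangular window, the window length, the delay and the test signal are all assumptions made for illustration; the paper does not specify them. The signal changes its period half-way through, and the running value at a fixed delay follows that change.

```python
import math


def short_time_autocorrelation(signal, start, window, lag):
    """Average of signal[i] * signal[i + lag] over a finite window from `start`."""
    total = sum(signal[i] * signal[i + lag] for i in range(start, start + window))
    return total / window


def running_autocorrelation(signal, window, lag, hop):
    """Short-time auto-correlation at one delay, evaluated every `hop` samples."""
    last_start = len(signal) - window - lag
    return [short_time_autocorrelation(signal, s, window, lag)
            for s in range(0, last_start + 1, hop)]


# Hypothetical signal: period of 8 samples at first, then 16 samples.
signal = [math.sin(2 * math.pi * i / 8) if i < 128 else math.sin(2 * math.pi * i / 16)
          for i in range(256)]

track = running_autocorrelation(signal, window=32, lag=8, hop=32)
early, late = track[0], track[-1]

print("running value at a delay of 8 samples:", [round(v, 3) for v in track])
```

At a delay of 8 samples the early windows span one full period of the signal and give a positive value, while the late windows span half a period and give a negative one, so the running function varies with time as the signal does.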


Measurement of this short time auto-correlation has been achieved
in several ways. On account of the accuracy of its frequency determination, the
auto-correlation function is a better method for visual display than other methods
demonstrated so far (particularly for fricatives in speech). As early as 1955,
Taskar had also calculated power spectra and auto-correlation functions
of vowels in the wake of the theoretical work of the senior author of the present
paper.


Licklider18 has made yet another approach to throw some more
light on structure in Speech. He distorted the wave amplitude using the
technique of infinite peak clipping; even though the result is a square
wave-form, the articulation is maintained at 70% (Pollack and Licklider, 1948).
The comprehensibility can be further increased by differentiating the curve.


In this experiment, of all the information contained in the original
speech waves, only the rhythmical succession of zero crossings of the
abscissa is retained after abridgment.
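
What infinite clipping keeps and what it discards can be shown in a few lines. This is a minimal sketch only: the two-component test wave and the function names are hypothetical, and treating a sample of exactly zero as positive is an arbitrary convention adopted here.

```python
import math


def infinite_clip(signal):
    """Reduce every sample to its sign: the amplitude is dichotomized to +1 or -1."""
    return [1 if s >= 0 else -1 for s in signal]


def zero_crossings(signal):
    """Sample indices at which the wave changes sign."""
    return [i for i in range(1, len(signal))
            if (signal[i - 1] >= 0) != (signal[i] >= 0)]


# Hypothetical test wave with two components of different period.
wave = [math.sin(2 * math.pi * i / 40 + 0.3) + 0.6 * math.sin(2 * math.pi * i / 13 + 0.3)
        for i in range(200)]

clipped = infinite_clip(wave)

print("zero crossings of the original wave:", zero_crossings(wave))
print("zero crossings of the clipped wave: ", zero_crossings(clipped))
```

The clipped wave takes only two values, yet its zero crossings fall at the same instants as those of the original wave; all other amplitude information is lost.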


Since infinite clipping dichotomizes the amplitude dimension and
no further reduction can be achieved in amplitude, it is necessary to
operate upon the temporal pattern if the speech-wave is to be further
simplified. The simplifying operation was done by quantization
of the time scale. In quantized time, a rectangular wave can
switch only at predetermined instants. Auto-correlation functions of these
amplitude-dichotomized, time-quantized speech-waves were calculated;
an articulation test of these speech-waves was also carried out. Consequently,
a theory of the hearing process might be constructed on quite a different basis
from the purely physical method of spectral analysis.
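
Quantization of the time scale, applied to an infinitely clipped wave, may be sketched as below. How the quantization was actually performed is not stated in this paper; the rule used here (the value at the start of each interval is held until the next predetermined instant), the interval of 4 samples and the test wave are all assumptions for illustration.

```python
import math


def infinite_clip(signal):
    """Dichotomize the amplitude: every sample becomes +1 or -1."""
    return [1 if s >= 0 else -1 for s in signal]


def quantize_time(clipped, step):
    """Hold the value found at the start of each interval of `step` samples,
    so that the wave can switch only at multiples of `step` (assumed rule)."""
    return [clipped[(i // step) * step] for i in range(len(clipped))]


def switch_instants(wave):
    """Sample indices at which the rectangular wave changes value."""
    return [i for i in range(1, len(wave)) if wave[i] != wave[i - 1]]


# Hypothetical test wave.
speech_like = [math.sin(2 * math.pi * i / 40 + 0.3)
               + 0.6 * math.sin(2 * math.pi * i / 13 + 0.3)
               for i in range(200)]

clipped = infinite_clip(speech_like)
quantized = quantize_time(clipped, step=4)

print("switching instants before quantization:", switch_instants(clipped))
print("switching instants after quantization: ", switch_instants(quantized))
```

After quantization every switching instant is a multiple of the chosen interval, so the wave is described entirely by a sequence of binary decisions taken at predetermined instants.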


III


Conclusions 


Such are the few technical limitations in the study of music and speech 
that we have discussed. What is necessary now is to examine these and 
several other techniques for analyses of both music and speech in the light 
of the alpha-phoneme-phonoid theory. The alpha-phonoid has already 
been defined as the minimum common duration of a ‘unit information cell’ 
in the physical stimulus as well as in the neurological and psychological 
responses. In other words, the minimum common duration of all these 
three will serve as the key ‘interval’ for the basic representation of both the 
structure in speech and sruti which is the microtonal ‘interval’ in music. 


The conceptual and technical difficulties facing an attempt to
determine the “ultimate Unit” of music or speech as the minimum common
duration of the physical stimulus, neuro-physiological and psychological
components of the total event seem formidable in the present state of the
brain sciences, especially as they bear on such high level functions as human
music and speech. The lack of functional-structural clarity in this field is
reflected in the sporadic and fragmentary attempts at studying the musical
and speech processes in terms of brain function.19


The following are some attempts in recent times:


So-called “command potentials” (averaged evoked potentials of surface
electro-encephalograms, i.e. EEG, preceding various voluntary activities)
were shown to differ in shape for the phonemes O, T and P (but not for the
numerals 2 and 10 or for the words ‘yes’ or ‘no’) when recorded over the left
temporal speech area.


The peripheral nervous system involvement in the speech process 
(the orienting reflex, recorded by a new plethysmographic technique dis- 
tinguishing orienting from defensive reaction by a vaso-motor criterion) 
led two Russian workers to postulate a semantic model of word nuclei 
surrounded by a semantic field of words linked to the nucleus by experimental
meaning relations rather than logical content.20


Analysing human behaviour, including verbal behaviour, from sound
motion pictures allowed two American workers to study ‘motor phonetics’
with the help of a basic form of “unit-in-change” or “process unit” by
which they were able to describe the ‘on-going flow’ of
“moments-of-sustaining-together of the body parts” in continual sequences of change.


These are the widely different techniques and instrumental approaches
that are being employed, as Bjorn Merker rightly observes, in the
objectification of the semantic process, and these varied techniques and
experimental investigations might be relevant to the investigation of the
alpha-phonoid, which is qualitatively defined as the minimum common duration
of the physical stimulus, as well as the neurophysiological and psychological
response-components of a total event.


Bjorn Merker also remarks that the above-mentioned objectifications 
of the semantic process are the surface or overt event of the music and 
speech processes while the definition of the alpha-phonoid goes further, 
as indeed one must, in order to distinguish human speech and music from 
animal calls so as to include the psychological component at high levels of 
abstraction and integration. In terms of the brain, this points to the con- 
vergent structures of the limbic system which is subcortical and accessible 
only by implanted depth electrodes. H. Lesse and R. G. Heath have 
recorded electrical activity in limbic structures in humans directly correlated 
with emotional thought and recall, but intracranial recording is not a 
generally applicable technique in humans. 


These technical problems are further complicated by the theoretical 
consideration that the information content of a stimulus event cannot be 
defined independently of the nature of the information processing and storage 
capacity, which in man is shaped by accumulating ontogenetic experience 
to an unusual degree, resulting in great individual and temporal variation. 


The very magnitude of this problem makes it the more challenging
a task, and therefore the theoretical formulation of the alpha-phonoid seems
to provide, as Bjorn Merker thinks, a provocative perspective within which
one may strive to approach a solution. The results of any deep
experimental investigation within this perspective might also find an application
within phylobiology when that baby science, to quote once again Bjorn
Merker, “reaches the maturity of speech and concrete manipulations.”